Neural Word Sense Disambiguation to Prune a Large Knowledge Graph of the Italian Cultural Heritage

Faralli, Stefano;Velardi, Paola
2022

Abstract

In this paper, we describe our recent findings on interlinking the ArCo Italian cultural heritage entities with the well-known Getty Art and Architecture (GVP) Thesaurus. We automatically extract candidate entities from textual descriptions and then prune ambiguous, out-of-domain entities using neural word sense disambiguation. The disambiguation task is particularly complex because, as detailed in the paper, we map Italian entities in the ArCo cultural heritage knowledge graph onto English lexical concepts (such as those in the GVP Thesaurus). To date, most entity linking and word sense disambiguation systems are designed to work with English and with general-purpose sense inventories and knowledge bases, such as DBpedia, BabelNet and WordNet. To address this challenging entity linking and disambiguation task, we adapted a state-of-the-art neural word sense disambiguation approach to work in this multi-language setting. We describe our adaptation process and discuss preliminary experimental results.
European Conference on Advances in Databases and Information Systems
cultural heritage; knowledge graphs
04 Publication in conference proceedings::04b Conference paper in volume
Neural Word Sense Disambiguation to Prune a Large Knowledge Graph of the Italian Cultural Heritage / Faggiani, Erica; Faralli, Stefano; Velardi, Paola. - Communications in Computer and Information Science, 1652 (2022), pp. 593-604. (Paper presented at the European Conference on Advances in Databases and Information Systems, held in Torino, Italy) [10.1007/978-3-031-15743-1_54].

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1662591